one-shot unsupervised cross domain translation
One-Shot Unsupervised Cross Domain Translation
Given a single image $x$ from domain $A$ and a set of images from domain $B$, our task is to generate the analogous of $x$ in $B$. We argue that this task could be a key AI capability that underlines the ability of cognitive agents to act in the world and present empirical evidence that the existing unsupervised domain translation methods fail on this task. Our method follows a two step process. First, a variational autoencoder for domain $B$ is trained. Then, given the new sample $x$, we create a variational autoencoder for domain $A$ by adapting the layers that are close to the image in order to directly fit $x$, and only indirectly adapt the other layers. Our experiments indicate that the new method does as well, when trained on one sample $x$, as the existing domain transfer methods, when these enjoy a multitude of training samples from domain $A$.
Reviews: One-Shot Unsupervised Cross Domain Translation
In overall, I think this paper proposes a well-designed two steps learning pipeline for one-shot unsupervised image translation. But the explanations about selective backpropagation in the rebuttal are still not clear to me. According to Eq.8-14, it seems that G S and E S are not updated in phase II. But according to the Tab. 1 and the rebuttal, they seem to be selectively updated. I strongly suggest the authors to explain the details and motivation in the method part if this paper is accepted.
One-Shot Unsupervised Cross Domain Translation
Given a single image $x$ from domain $A$ and a set of images from domain $B$, our task is to generate the analogous of $x$ in $B$. We argue that this task could be a key AI capability that underlines the ability of cognitive agents to act in the world and present empirical evidence that the existing unsupervised domain translation methods fail on this task. Our method follows a two step process. First, a variational autoencoder for domain $B$ is trained. Then, given the new sample $x$, we create a variational autoencoder for domain $A$ by adapting the layers that are close to the image in order to directly fit $x$, and only indirectly adapt the other layers.